Conveying Spatial Information Using Vision and Natural Language
Authors
Alicia Abella, John R. Kender
Department of Computer Science, Columbia University, New York, NY 10027
Abstract
Technology has made it possible to gather a great deal of data in the form of images, and at the same time it has created a demand for extracting what is conveyed in these images. Extracting what is in these images requires vision, and conveying what is extracted requires natural language. This paper examines how computer vision and natural language processing can be used to address the problem of object localization in a 2D image. The ultimate goal is a system capable of generating descriptions that relate the spatial arrangement of the objects through the use of spatial prepositions. Our previous work described a computational model of preposition semantics and a method for handling some of the ambiguities associated with natural language. In this paper we extend that work by introducing the theoretical methodologies for generating locative expressions that represent the spatial relationship of objects in an image.
Introduction
The integration of vision and natural language has recently emerged as an area of crucial importance to tasks that involve communicating visual information. Figure 1 illustrates the current state of affairs. There is a significant body of theoretical results and working systems in the areas of computer vision and natural language processing. The area formed by their overlap, however, contains comparatively few theoretical results, and even fewer successful implementations that utilize both vision and natural language. This paper presents a semantic representation of spatial prepositions based on an image's visual properties for the purpose of describing the spatial relationship of objects in an image. This semantic representation relies on object properties extracted from the image using standard vision techniques; see, e.g., [Horn, 1990]. These properties include an object's area, center of mass, and elongation.
An early attempt at defining a model for language understanding and developing a system to test it is il?
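The object properties named in the introduction (area, center of mass, and elongation) can be illustrated with a minimal sketch based on image moments of a binary mask. This is not the authors' implementation: the function name `region_properties`, the list-of-rows mask format, and the definition of elongation as the square root of the ratio of the larger to the smaller eigenvalue of the second-order central moment matrix are all assumptions made for illustration.

```python
import math


def region_properties(mask):
    """Return (area, (cx, cy), elongation) for a binary mask.

    `mask` is a list of rows of 0/1 values (hypothetical input format).
    Elongation is defined here, as an assumption, as
    sqrt(lambda_max / lambda_min) of the second-moment matrix.
    """
    # Collect (x, y) coordinates of all object pixels.
    pixels = [(x, y)
              for y, row in enumerate(mask)
              for x, value in enumerate(row) if value]
    area = len(pixels)
    if area == 0:
        raise ValueError("mask contains no object pixels")

    # Center of mass: mean pixel coordinate.
    cx = sum(x for x, _ in pixels) / area
    cy = sum(y for _, y in pixels) / area

    # Second-order central moments, normalized by area.
    mxx = sum((x - cx) ** 2 for x, _ in pixels) / area
    myy = sum((y - cy) ** 2 for _, y in pixels) / area
    mxy = sum((x - cx) * (y - cy) for x, y in pixels) / area

    # Eigenvalues of the symmetric matrix [[mxx, mxy], [mxy, myy]].
    mean = (mxx + myy) / 2
    spread = math.hypot((mxx - myy) / 2, mxy)
    lam_max, lam_min = mean + spread, mean - spread

    # A degenerate region (e.g. a one-pixel-wide line) has no minor axis.
    elongation = math.sqrt(lam_max / lam_min) if lam_min > 1e-12 else math.inf
    return area, (cx, cy), elongation


if __name__ == "__main__":
    bar = [[1] * 6, [1] * 6]          # a 6x2 horizontal bar
    print(region_properties(bar))
```

Under this definition a square region has an elongation of 1 and the value grows as the region becomes more bar-like; other definitions (for example, the ratio of bounding-box sides) would serve equally well for a sketch.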
Similar resources
Iconic Gestures with Spatial Semantics: A Case Study
The spontaneous gestures that accompany spoken language are particularly suited to conveying spatial information, yet their briefness, individuality, and lack of conventional linguistic structure impede their integration into NLU systems. The current work characterizes spontaneous size gestures in a manual task corpus, clarifying their form, discourse role and representation as a first step tow...
A survey of qualitative spatial representations
Representation and reasoning with qualitative spatial relations is an important problem in artificial intelligence and has wide applications in the fields of geographic information system, computer vision, autonomous robot navigation, natural language understanding, spatial databases and so on. The reasons for this interest in using qualitative spatial relations include cognitive comprehensibil...
Assessment of Spatial Multi-Criteria Decision-Making with Process of the Artificial Neural Networks Method to Site Selection of the Wastewater Treatment Plant (Case Study: Qeshm Island)
Wastewater treatment technology in the cyclic nature of the process that takes a long time. But man tries to rush to their needs with experience and understanding of the natural processes of interaction, and using technology to build their Industrial development is authorized. Sewage treatment reed have been born from the vision of man's increasing need to water daily decreases the natural reso...
Spatial Relations Between 3D Objects: The Association Between Natural Language, Topology, and Metrics
With the proliferation of 3D image data comes the need for advances in automated spatial reasoning. One specific challenge is the need for a practical mapping between spatial reasoning and human cognition, where human cognition is expressed through natural-language terminology. With respect to human understanding, researchers have found that errors about spatial relations typically tend to be me...